Multi-Phase Multi-Objective Dexterous Manipulation with Adaptive Hierarchical Curriculum
Dexterous manipulation tasks usually have multiple objectives, and the
priorities of these objectives may vary at different phases of a manipulation
task. Varying priorities make it difficult, or even impossible, for a robot to
learn an optimal
policy with a deep reinforcement learning (DRL) method. To solve this problem,
we develop a novel Adaptive Hierarchical Reward Mechanism (AHRM) to guide the
DRL agent to learn manipulation tasks with multiple prioritized objectives. The
AHRM can determine the objective priorities during the learning process and
update the reward hierarchy to adapt to the changing objective priorities at
different phases. The proposed method is validated in a multi-objective
manipulation task with a JACO robot arm in which the robot needs to manipulate
a target surrounded by obstacles. The simulation and physical experiment
results show that the proposed method improves both task performance and
learning efficiency.
Comment: Accepted by the Journal of Intelligent & Robotic System
Stable In-hand Manipulation with Finger Specific Multi-agent Shadow Reward
Deep Reinforcement Learning (DRL) has shown its capability to handle the high
degrees of freedom in control and the complex interaction with the object in
multi-finger dexterous in-hand manipulation tasks. Current DRL approaches
prefer sparse rewards to dense rewards for the ease of training but lack
behavior constraints during the manipulation process, leading to aggressive and
unstable policies that are insufficient for safety-critical in-hand
manipulation tasks. Dense rewards can regulate the policy to learn stable
manipulation behaviors with continuous reward constraints but are hard to
empirically define and slow to converge optimally. This work proposes the
Finger-specific Multi-agent Shadow Reward (FMSR) method to determine the stable
manipulation constraints in the form of dense reward based on the state-action
occupancy measure, a general utility of DRL that is approximated during the
learning process. Information Sharing (IS) across neighboring agents enables
consensus training to accelerate the convergence. The methods are evaluated in
two in-hand manipulation tasks on the Shadow Hand. The results show FMSR+IS
converges faster in training, achieving a higher task success rate and better
manipulation stability than a conventional dense reward. The comparison
indicates that, even with the behavior constraint, FMSR+IS achieves a success
rate comparable to, and manipulation stability much better than, a policy
trained with a sparse reward.
Curriculum-based Sensing Reduction in Simulation to Real-World Transfer for In-hand Manipulation
Simulation to Real-World Transfer allows affordable and fast training of
learning-based robots for manipulation tasks using Deep Reinforcement Learning
methods. Currently, Sim2Real uses Asymmetric Actor-Critic approaches to reduce
the rich idealized features in simulation to the accessible ones in the real
world. However, the feature reduction from the simulation to the real world is
conducted through an empirically defined one-step cut. A small feature
reduction does not remove enough of the actor's features, which may still
make the physical system difficult to set up, while a large feature reduction
may make training difficult and inefficient. To address this issue, we
propose Curriculum-based Sensing Reduction, which lets the actor start with
the same rich feature space as the critic and then drop the
hard-to-extract features step by step for higher training performance and
better adaptation to the real-world feature space. The reduced features are
replaced with random signals from a Deep Random Generator to remove the
dependency between the output and the removed features and avoid creating new
dependencies. The methods are evaluated on the Allegro robot hand in a
real-world in-hand manipulation task. The results show that our methods
train faster and achieve higher task performance than the baselines, and can
solve real-world tasks when selected tactile features are reduced.
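
The step-by-step feature reduction described in this abstract can be pictured with a minimal sketch. Everything below is an illustrative assumption and not the authors' implementation: the function name `reduce_observation`, the `removal_order` grouping, and the stage counter are hypothetical, and plain uniform noise stands in for the Deep Random Generator named in the abstract.

```python
import numpy as np

def reduce_observation(obs, removal_order, stage, rng):
    """Return a copy of obs with the first `stage` feature groups replaced by noise.

    removal_order is a list of index groups, hardest-to-extract first
    (a hypothetical ordering chosen for this sketch).
    """
    reduced = np.array(obs, dtype=float, copy=True)
    for group in removal_order[:stage]:
        # Assumption: uniform noise stands in for the paper's random generator.
        reduced[group] = rng.uniform(-1.0, 1.0, size=len(group))
    return reduced

rng = np.random.default_rng(0)
obs = np.arange(6, dtype=float)      # toy actor observation
removal_order = [[4, 5], [2, 3]]     # groups dropped at stage 1, then stage 2

stage0 = reduce_observation(obs, removal_order, 0, rng)  # full feature space
stage1 = reduce_observation(obs, removal_order, 1, rng)  # first group replaced
```

At stage 0 the actor sees the same features as the critic; each later stage overwrites one more group with noise, so the policy cannot keep relying on those features.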
A Multi-Agent Approach for Adaptive Finger Cooperation in Learning-based In-Hand Manipulation
In-hand manipulation is challenging for a multi-finger robotic hand due to
its high degrees of freedom and the complex interaction with the object. To
enable in-hand manipulation, existing deep reinforcement learning based
approaches mainly focus on training a single robot-structure-specific policy
through the centralized learning mechanism, lacking adaptability to changes
like robot malfunction. To address this limitation, this work treats each finger
as an individual agent and trains multiple agents to control their assigned
fingers to complete the in-hand manipulation task cooperatively. We propose the
Multi-Agent Global-Observation Critic and Local-Observation Actor (MAGCLA)
method, where the critic can observe all agents' actions globally, and the
actor only locally observes its neighbors' actions. In addition, conventional
individual experience replay may cause unstable cooperation because each
agent's performance improves asynchronously, and stable cooperation is
critical for in-hand manipulation tasks. To solve this issue, we propose the
Synchronized Hindsight
Experience Replay (SHER) method to synchronize and efficiently reuse the
replayed experience across all agents. The methods are evaluated in two in-hand
manipulation tasks on the Shadow dexterous hand. The results show that SHER
helps MAGCLA achieve learning efficiency comparable to a single policy, and
that the MAGCLA approach generalizes better across different tasks. The
trained policies adapt better in the robot malfunction test than the baseline
multi-agent and single-agent approaches.
Comment: Submitted to ICRA 202
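
The global-critic and local-actor split described in this abstract can be sketched as follows. This is a toy illustration under stated assumptions, not the MAGCLA implementation: the helper names `critic_input` and `actor_input`, the four-agent chain topology in `neighbors`, and the two-dimensional actions are all hypothetical.

```python
import numpy as np

def critic_input(actions):
    """Global view: concatenate every agent's action."""
    return np.concatenate(actions)

def actor_input(actions, agent, neighbors):
    """Local view: concatenate only the actions of the given agent's neighbors."""
    return np.concatenate([actions[j] for j in neighbors[agent]])

# Four finger agents, each with a toy 2-D action filled with its own index.
actions = [np.full(2, float(i)) for i in range(4)]
# Assumption: fingers form a chain, so each agent neighbors the adjacent ones.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

global_view = critic_input(actions)             # all 4 agents, 8 values
local_view = actor_input(actions, 1, neighbors)  # agents 0 and 2 only
```

The critic's input grows with the number of agents, while each actor's input depends only on its neighborhood.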
Learn and Transfer Knowledge of Preferred Assistance Strategies in Semi-autonomous Telemanipulation
Enabling robots to provide effective assistance while still accommodating the
operator's commands during telemanipulation of an object is very challenging
because the robot's assistive actions are not always intuitive for human
operators, and human behaviors and preferences are sometimes ambiguous for the
robot to interpret. Although various assistance approaches are being developed
to improve the control quality from different optimization perspectives, it
remains difficult to determine an appropriate approach that satisfies both
the fine motion constraints of the telemanipulation task and the preferences
of the operator. To address these problems, we developed a novel preference-aware
assistance knowledge learning approach. An assistance preference model learns
what assistance is preferred by a human, and a stagewise model updating method
ensures learning stability while dealing with the ambiguity of human
preference data. Such preference-aware assistance knowledge enables a
teleoperated robot hand to provide more active yet preferred assistance toward
manipulation success. We also developed knowledge transfer methods to transfer
the preference knowledge across different robot hand structures to avoid
extensive robot-specific training. Experiments were conducted in which a
3-finger hand and a 2-finger hand were each telemanipulated to use, move, and
hand over a cup. Results demonstrated that the methods enabled the robots to
effectively learn the preference knowledge and allowed knowledge transfer
between robots with less training effort.